fix: catch TypeError when accessing think token properties #1141

Closed
nightguarder wants to merge 2 commits into jundot:main from nightguarder:fix/think-token-typeerror-safety
@nightguarder nightguarder commented May 9, 2026

Fix TypeError for non-thinking models

Catch TypeError alongside ValueError in three spots in scheduler.py. Without this, /v1/completions crashes on TranslateGemma.

Explanation

Some models (e.g. TranslateGemma) don't have _think_start_tokens initialized, so accessing the think_start_id property raises TypeError instead of ValueError.

The three existing except ValueError blocks miss this, causing engine loop crashes when serving such models via /v1/completions.

Catch (ValueError, TypeError) on all three sites:

  • _detect_needs_think_prefix in scheduler step
  • ThinkingBudgetProcessor think_start_id resolution
  • _resolve_think_end_token_ids think_end_id resolution
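The failure mode and the inline fix can be sketched as follows. This is a minimal illustration, not the actual code in omlx/scheduler.py: `FakeTokenizer` and `detect_needs_think_prefix` are hypothetical stand-ins, and the assumption that the property calls `len()` on an uninitialized `_think_start_tokens` is only one plausible way the TypeError could arise.

```python
class FakeTokenizer:
    """Hypothetical stand-in for a tokenizer wrapper; not the real class."""

    def __init__(self, think_start_tokens=None):
        # None models a tokenizer whose think tokens were never initialized.
        self._think_start_tokens = think_start_tokens

    @property
    def think_start_id(self):
        # len(None) raises TypeError, which `except ValueError` does not catch.
        if len(self._think_start_tokens) != 1:
            raise ValueError("think start is not a single token")
        return self._think_start_tokens[0]


def detect_needs_think_prefix(tokenizer):
    """Simplified call site: treat any lookup failure as 'no think prefix'."""
    try:
        think_start_id = tokenizer.think_start_id
    except (ValueError, TypeError):
        return False
    return think_start_id is not None


print(detect_needs_think_prefix(FakeTokenizer()))      # uninitialized tokens
print(detect_needs_think_prefix(FakeTokenizer([42])))  # single think token
```

With only `except ValueError`, the first call would propagate the TypeError to the caller instead of returning a value.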

What this does

This enables support for TranslateGemma and other non-thinking models whose
tokenizer lacks _think_start_tokens. These models were previously blocked by the TypeError.

Note:
The TranslateGemma chat template must still be applied client-side and the
resulting prompt sent to /v1/completions. See the original issue #879 for an explanation.

Files modified

  • omlx/scheduler.py

…nking models

This enables support for TranslateGemma and other non-thinking models whose
tokenizer lacks _think_start_tokens. Previously blocked by TypeError,
TranslateGemma-4b-it now works correctly via /v1/completions with a
client-side chat template.

Note: TranslateGemma uses a custom chat template requiring
source_lang_code/target_lang_code fields that OMLX /v1/chat/completions
does not support. The chat template must be applied client-side and the
resulting prompt sent to /v1/completions instead. See the model's
chat_template.jinja for the exact prompt format.

Catch (ValueError, TypeError) on all three sites:
- _detect_needs_think_prefix in scheduler step
- ThinkingBudgetProcessor think_start_id resolution
- _resolve_think_end_token_ids think_end_id resolution
cangming009 pushed a commit to cangming009/omlx that referenced this pull request May 10, 2026
mlx-lm#1171 changed the MiniMax M2 tool parser to return a list when a
single <minimax:tool_call> block contains multiple <invoke>s. Without
list/dict flattening in api/tool_calling.py, parallel tool calls were
silently dropped via AttributeError swallowed by the existing except.

Flatten parser results in the native path using the same isinstance(list)
pattern already used in the Gemma 4 fallback. Add regression tests for
single-dict, multi-list, and multi-block-multi-invoke cases.
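
The list/dict flattening described above can be sketched roughly like this. It is an illustrative assumption, not the code in api/tool_calling.py: `flatten_tool_calls` and the dict shape of a parsed call are hypothetical.

```python
def flatten_tool_calls(parsed_blocks):
    """Flatten parser output where each block yields a dict or a list of dicts.

    Hypothetical helper: a parser may return one dict for a single invoke,
    or a list of dicts when one block contains several invokes.
    """
    calls = []
    for parsed in parsed_blocks:
        if isinstance(parsed, list):
            # Multiple invokes inside one tool_call block.
            calls.extend(parsed)
        else:
            # A single invoke returned as a plain dict.
            calls.append(parsed)
    return calls


single = {"name": "get_weather", "arguments": {"city": "Oslo"}}
multi = [
    {"name": "get_time", "arguments": {"tz": "UTC"}},
    {"name": "get_date", "arguments": {}},
]
print([c["name"] for c in flatten_tool_calls([single, multi])])
```

Without the `isinstance` check, code that expects a dict would fail on the list case, which is how calls could be dropped when the resulting exception is swallowed.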

Also picks up BatchKVCache/BatchRotatingKVCache.extend() batch-dim fix
(jundot#1141) and the tree_reduce import fix (jundot#1165) from the same bump.

jundot commented May 21, 2026

Owner

Thanks for the writeup, and you were right that this crashed. On 0.3.5.dev1 the three sites only had except ValueError, so a model like TranslateGemma whose think_start_id raises TypeError slipped through and crashed /v1/completions.

That same bug was already fixed on main in the meantime. The _get_think_token_id helper in scheduler.py now wraps those accesses and catches both ValueError and TypeError, returning None, and all three lines you changed already call that helper.

So your fix and the helper do the same thing, which makes these inline changes redundant on main. I'll close this for that reason. If you still see the crash on the latest main, please open an issue.

@jundot jundot closed this May 21, 2026